Skip to content

Add set data test cases #61

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Aug 13, 2024
Merged

Add set data test cases #61

merged 15 commits into from
Aug 13, 2024

Conversation

stevenhua0320
Copy link
Contributor

Note that I still make the case x.size > 0 and res ==0 will give a message because as before Luke said that this case would make trivial clustering, which would make every data point as its own cluster, which is not desired clustering that we want.

Copy link
Contributor

@sbillinge sbillinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice job! Please see comments.


# In the set data test, we test for these cases.
# (1) x and y are non-empty array values, and res is positive (the most generic case)
# (2) x and y are non-empty array values, and res is 0 (will produce a msg that makes trivial clustering)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not raise errors or warnings for this. It is just the identity (gives back the input unchanged) which is not that useful but not anything invalide. Please make an issue to address this in the docs, but not in the code.

)
def test_set_data(inputs, expected):
actual = DataClusters(x=inputs["input_x"], y=inputs["input_y"], res=inputs["input_res"])
assert actual == expected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a note, the expected we would like to explicitly set all its attributes. this will also implicitly test other parts of the code that generate those attributes.

(
# case (4)
{
"input_x": np.array([]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as I mentioned before, please use "x" here, not "input_x" to make the code more readable. Also, don't make the x-array empty, but just make them have different lengths.

"Sequences x and y must have the same length.",
),
(
# case (5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed. Please delete.

"input_y": np.array([3]),
"input_res": -1,
},
"Resolution res must be non-negative.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please expand a bit. Will the user know what "resolution" means?

"Resolution res must be non-negative.",
),
(
# case (2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed, please remove. This will be handled in docs not in code.

),
],
)
def test_set_data_order_bad(inputs, msg):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good test, nice job.

DataClusters(np.array([1, 2, 3]), np.array([3, 2, 1]), 4),
),
(
# case (6)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not needed (because we are not catching res 0 any more)

# (4, 5) One of x and y is empty array, and res is positive
# (produce ValueError & msg "Sequences x and y must have the same length.", something like that)
# (6) Both x and y are empty array, and res is zero.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove these blank lines so the tests are as compact as possible.

# (2) x and y are non-empty array values, and res is 0 (will produce a msg that makes trivial clustering)
# (3) x and y are non-empty array values, and res is negative (will produce a ValueError,
# msg = please enter a non-negative res value)
# (4, 5) One of x and y is empty array, and res is positive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I woud test for len(x) != len(y) (ValueError) and not worry about the other cases. The user won't give a len(0) array on purpose and the code will error in that case anyway (on sort for example) in case the user does it inadvertently.

Copy link
Contributor

@sbillinge sbillinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please see my inline comments. I also made some mods to the error messages, and also put the docstrings into numpy format. Please can you take a look and try and use the same pattern for all docstrings.

The main issue remaining is that test_set_data is not testing anything atm.

"y": np.array([3, 2, 1]),
"res": 4,
},
DataClusters(np.array([1, 2, 3]), np.array([3, 2, 1]), 4),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't appear to be testing anything as your actual and expected are both just running the same functions.

Copy link
Contributor

@sbillinge sbillinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks this is good. Please see the comment

"DONE": 3,
"lastcluster_idx": None,
"status": 1,
}
),
],
)
def test_set_data(inputs, expected):
actual = DataClusters(x=inputs["x"], y=inputs["y"], res=inputs["res"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is better. Strinctly this still doesn't test set_data alone. It tests the object constructor. I think this is ok, but we may want to make clear and set_data as private functions. Then we don't need tests (or docstrings in prinicple) for them and we just test the constructor (the __init__).

Whether or not to do this depends where else these functions rae used. Do we want to make them available to users to use, or are they just being used in init alone or in init and a few other places.

These are small things, but once we touch the code we want to leave it better than when we arrived, and it is also a good learning experience.....

Could you look how these two functions are used and we can decide. If we make them private functions we can leave this test as it is but just change its name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is better. Strinctly this still doesn't test set_data alone. It tests the object constructor. I think this is ok, but we may want to make clear and set_data as private functions. Then we don't need tests (or docstrings in prinicple) for them and we just test the constructor (the __init__).

Whether or not to do this depends where else these functions rae used. Do we want to make them available to users to use, or are they just being used in init alone or in init and a few other places.

These are small things, but once we touch the code we want to leave it better than when we arrived, and it is also a good learning experience.....

Could you look how these two functions are used and we can decide. If we make them private functions we can leave this test as it is but just change its name.

I'm certain that these two functions are only used in the constructor. It should be OK to change them into private functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super, let's do this. Just

  • change the name of this test to something like test_DataClusters_constructor since this is what it does anyway.
  • add an underscore to the beginning of the clear and set_data functions. You can leave the docstring since we already wrote it.
  • revisit the test_clear tests. We want to remove this test, but let's make sure that this behavior is being tested in the constructor test, so move over anything we need from there.

@sbillinge
Copy link
Contributor

I made a few final changes and merged. Nice job there. Please make sure to check any docstrings on functions you touch are in the numpy format (see my examples) and also take these lessons forward about the tests, but nice job!

Please make an issue to remove the _clear() function sometime in the future (preferably when we have more tests). Since it is only used in the constructor it will never encounter a version of the object that is not empty. for now, leave it as an issue as we want to do more functional testing before we make any possibly breaking changes when there are few tests in the code.

The focus now is for you to successfully use the code in its current state.

@sbillinge sbillinge merged commit d1bf3d4 into diffpy:Cookie Aug 13, 2024
3 checks passed
sbillinge added a commit that referenced this pull request Sep 4, 2024
* Lint check & fix to python3 format (#18)

* lint check and change files from python2 to python3

* pre-commit check for these files

* lint check & change to python3 & pre-commit check (#19)

* lint check & change to python3 & pre-commit check

* [pre-commit.ci] auto fixes from pre-commit hooks

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* lint check and fix print and exception python2 issue (#20)

* lint check and fix python2 print and exception issues (#21)

* lint check and fix python2 print and exception issues

* [pre-commit.ci] auto fixes from pre-commit hooks

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* finish parenthesizing print statements (#24)

* finish parenthesizing print statements

* [pre-commit.ci] auto fixes from pre-commit hooks

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* fix too many leading #, import modules, and unused var (#29)

* requirements (#30)

* fix import module not used & string check (#25)

* fix too many leading "#" in string block (#26)

* lint check, remove unused import modules & remove too many "#". (#27)

* remove unused modules, ambiguous variable name (#28)

* cleaning (#31)

* requirements

* clean out __init__

* replace ###

* ins not none in modelevaluators base

* Copyright (#32)

* requirements

* basefunction

* all the copyright statements

* lint check, fix break import modules, remove unused import modules, remove some # (#33)

* fix break import modules, remove unused import modules, fix docstring length (#34)

* fix formatting issue and typo in copyright (#35)

* clean out inits (#38)

* clean out inits

* [pre-commit.ci] auto fixes from pre-commit hooks

* dataclusters.py, modelevaluators/aicc and modelparts.py

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* peakextraction.py and init (#40)

* move untrack doc and requirement files (#41)

* move untrack doc and requirement files

* add requirement in run.txt

* [pre-commit.ci] auto fixes from pre-commit hooks

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* add pyproject.toml (#42)

* add pyproject.toml

* [pre-commit.ci] auto fixes from pre-commit hooks

* update classifiers pyproject.toml

* Delete setup.py

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* move diffpy files to src dir (#44)

* move diffpy files to src dir

* [pre-commit.ci] auto fixes from pre-commit hooks

* add Luke to authors

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* LICENSE (#45)

* add two LICENSE.rst files into cookiecutter

* fix LICENSE.rst and LICENSE_PDFgui.rst with correct references and year

* resolve pdfdataset.py conflict

---------

Co-authored-by: Simon Billinge <[email protected]>

* add untrack files and add cookiecut.rst news (#46)

* add untrack files and add cookiecut.rst news

* delete README.txt

* fix py2 -> py3, fix broken import, remove deprecation warning (#47)

* fix py2 -> py3, move deprecation warning

* fix search & split in binary files

* fix broken import, remove deprecated pkg_resource (#50)

* change import path to make it work. (#48)

* fix import modules, py2->py3 (#49)

* fix broken import in doc, change README to rst file. (#51)

* fix broken import in doc, change README to rst file.

* fix os getcwd method

* fix p2 to p3 (#52)

* add test for dataclusters (#54)

* add test for dataclusters

* define eq method in dataclusters.py

* change parametrization form

* add one more case and change reference name to actual

* delete comment

* add two more tests for DataClusters class function.

* change in docstring for clearer explanation for clear method, remove duplicated case for testing behavior, remove other tests.

* change clear method docstring into numpydoc format. Delete dtype for numpy array.

* remove block

* Make edition to condition on res, refactor for setdata to make behavior of the test passed.

* change condition on res

* add condition on x and res are incompatible, update test.

* revert change in setdata method.

* Eq tests (#59)

* remove diffpy/srmise tree

* test for eq

* add attributes in eq method (#60)

* Add set data test cases (#61)

* add test cases to test files and make edition to make sure the behavior of the test pass.

* [pre-commit.ci] auto fixes from pre-commit hooks

* change case in test__eq__ to be compatible with the behavior of setdata

* delete text and redundant tests

* tweaking error message in DataClusters

* [pre-commit.ci] auto fixes from pre-commit hooks

* update test for checking implicit attributes for setdata function

* [pre-commit.ci] auto fixes from pre-commit hooks

* update test for setdata function

* update setdata test to right format.

* update to constructor test & make setdata clear function private

* final tweaks to tests by Simon

* fix actual_attribute typo

* final refactor of actual_attr

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Simon Billinge <[email protected]>

* fix arbitrary.py to numpydoc format (#68)

* fix arbitrary.py to numpydoc format

* pre-commit fix

* change start sentence to 'The'

* print things correctly (#71)

* print things correctly

* change to f string

* reduce print to one line

* change createpeak to actualize function (#72)

* fix import and counting to make it work (#74)

* refactor makeclusters to make it work (#73)

* deprecation remove (#78)

* deprecation remove

* fix to right behavior

* Revert "refactor makeclusters to make it work (#73)" (#79)

This reverts commit 3773bcf.

* try out py2 before py3 refactor to make sure correct workflow (#75)

* fix false counting and numpy to int (#80)

* fix false counting and numpy to int

* [pre-commit.ci] auto fixes from pre-commit hooks

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* numpydoc edition (#81)

* change peakextraction function to numpydoc

* pre-commit run

* remove unused import

* numpydoc build (#82)

* numpydoc build on peakstability (#83)

* numpydoc build for ModelCluster (#85)

* numpydoc build for multimodelselection.py (#87)

* numpydoc documentation build for ModelCluster class (#86)

* numpydoc build for pdfdataset (#88)

* numpydoc build for pdfpeakextraction.py (#89)

* numpydoc build for gaussianoverr.py (#91)

* numpydoc build for gaussianoverr.py

* [pre-commit.ci] auto fixes from pre-commit hooks

* fix pre-commit

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* terminationripples.py numpydoc build (#92)

* numpydoc build for gaussian.py (#90)

* numpydoc build for gaussian.py

* [pre-commit.ci] auto fixes from pre-commit hooks

* pre-commit fix

* update for FWHM and maxwidth

* update for starting sentence

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* numpydoc build for base.py (#95)

* numpydoc build for polynomial.py (#97)

* numpydoc build for fromsequence.py (#99)

* numpydoc build for nanospherical.py (#98)

* numpydoc build for base.py in Baseline class (#96)

* numpydoc build for base.py

* [pre-commit.ci] auto fixes from pre-commit hooks

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* numpydoc build for aic.py (#93)

* numpydoc build for aicc.py (#94)

* numpydoc build for ModelCovariance (#84)

* numpydoc build for ModelCovariance

* update format type and fix indentation issue

* numpydoc build for modelparts.py (#100)

* numpydoc build for modelparts.py

* [pre-commit.ci] auto fixes from pre-commit hooks

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* numpydoc build for basefunction.py (#101)

* api workflow build for diffpy.srmise (#102)

* api workflow build for diffpy.srmise

* [pre-commit.ci] auto fixes from pre-commit hooks

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* add changed news (#103)

---------

Co-authored-by: Rundong Hua <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants